
Automatic Grader of MT Outputs in Colloquial Style by Using Multiple Edit Distances


Abstract

This paper addresses the challenging problem of automating the human ability to evaluate output from machine translation (MT) systems, which are subsystems of Speech-to-Speech MT (SSMT) systems. Conventional automatic MT evaluation methods include BLEU, which MT researchers have used frequently. BLEU is unsuitable for SSMT evaluation for two reasons. First, BLEU penalizes errors lightly at the beginning or end of a translation and heavily in the middle, although the assessment should be independent of position. Second, BLEU lacks tolerance for colloquial sentences with small errors, although such errors do not prevent a conversation from continuing. In this paper, the authors report a new evaluation method called RED that automatically grades each MT output by using a decision tree (DT). The DT is learned from training examples that are encoded by using multiple edit distances and their grades. The multiple edit distances are the normal edit distance (ED), defined by insertion, deletion, and replacement, as well as extensions of ED. The use of multiple edit distances allows more tolerance than either ED or BLEU. Each evaluated MT output is assigned a grade by using the DT. RED and BLEU were compared on the task of evaluating SSMT systems of varying performance on a spoken language corpus, ATR's Basic Travel Expression Corpus (BTEC). Experimental results showed that RED significantly outperformed BLEU.
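
The encoding step described in the abstract can be illustrated with a minimal sketch. The word-level `edit_distance` below is the standard ED with unit-cost insertion, deletion, and replacement. The abstract does not specify the extensions of ED, so the character-level and length-normalized variants in `encode` are hypothetical stand-ins chosen only for illustration, as are the function names. In RED, vectors like these, paired with human grades, would be the training examples for the decision tree learner, which is not shown here.

```python
from typing import List, Sequence


def edit_distance(hyp: Sequence[str], ref: Sequence[str]) -> int:
    """Standard edit distance: unit-cost insertion, deletion, replacement."""
    # prev[j] holds the distance between the hyp prefix seen so far and ref[:j].
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # turning hyp[:i] into an empty reference takes i deletions
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # keep or replace
        prev = curr
    return prev[-1]


def encode(hyp: str, ref: str) -> List[float]:
    """Encode one MT output against one reference as a feature vector.

    ASSUMPTION: only the first feature is the normal ED from the abstract.
    The other two are illustrative placeholders for the unspecified
    'extensions of ED', not the authors' definitions.
    """
    hyp_words, ref_words = hyp.split(), ref.split()
    word_ed = edit_distance(hyp_words, ref_words)
    char_ed = edit_distance(hyp, ref)              # a string is a sequence of characters
    normalized = word_ed / max(len(ref_words), 1)  # scale by reference length
    return [float(word_ed), float(char_ed), normalized]


if __name__ == "__main__":
    print(encode("i want a ticket", "i want ticket"))
```

Because the distance treats every position in the same way, an error at the start of a sentence costs the same as one in the middle, which is the position independence the abstract asks for.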
